Supervised Reinforcement Learning via Value Function
Authors
Abstract
Similar resources
Convergent Reinforcement Learning with Value Function Interpolation
We consider the convergence of a class of reinforcement learning algorithms combined with value function interpolation methods, using the techniques developed in (Littman & Szepesvári, 1996). As a special case of the general results obtained, we prove for the first time the (almost sure) convergence of Q-learning when combined with value function interpolation in uncountable spaces.
Value-function reinforcement learning in Markov games
Markov games are a model of multiagent environments that are convenient for studying multiagent reinforcement learning. This paper describes a set of reinforcement-learning algorithms based on estimating value functions and presents convergence theorems for these algorithms. The main contribution of this paper is that it presents the convergence theorems in a way that makes it easy to reason ab...
Value-Aware Loss Function for Model Learning in Reinforcement Learning
We consider the problem of estimating the transition probability kernel to be used by a model-based reinforcement learning (RL) algorithm. We argue that estimating a generative model that minimizes a probabilistic loss, such as the log-loss, might be an overkill because such a probabilistic loss does not take into account the underlying structure of the decision problem and the RL algorithm tha...
Efficient Reinforcement Learning in Deterministic Systems with Value Function Generalization
Managing Uncertainty within Value Function Approximation in Reinforcement Learning
The dilemma between exploration and exploitation is an important topic in reinforcement learning (RL). Most successful approaches to this problem use some uncertainty information about the values estimated during learning. On the other hand, scalability is a known weakness of RL algorithms, and value function approximation has become a major topic of research. Both problems ari...
Journal
Journal title: Symmetry
Year: 2019
ISSN: 2073-8994
DOI: 10.3390/sym11040590